Refactor Dataset.map to merge attrs instead of copying #11020

max-sixty · 2025-12-16T05:43:52Z

I'm not sure this is the best solution for the specific case, but it's at least consistent with our existing behavior (I think? not super confident), and is quite reasonable behavior.

Other options:

- drop all attrs on `keep_attrs=False`
- use a dict-like merge on `keep_attrs=True`

Changes:

- `Dataset.map()` / `DataTree.map()`: When `keep_attrs=True`, merge attrs from function result and original using `drop_conflicts` (matching attrs kept, conflicting attrs dropped). When `keep_attrs=False`, leave attrs as the function returned them.
- Weighted operations: Explicitly clear attrs when `keep_attrs=False`, since internal computations (like `dot`) propagate attrs from weights.

- Closes Problem using assign_attrs() in map() #11019
- Tests added
- User visible changes (including notable bug fixes) are documented in whats-new.rst
- New functions/methods are listed in api.rst

Update the `keep_attrs` behavior in `Dataset.map()` and `DataTree.map()` to merge attributes from the original and function results using the `drop_conflicts` strategy, rather than unconditionally copying original attrs. When `keep_attrs=True`, matching attrs are kept and conflicting attrs are dropped. When `keep_attrs=False`, only attrs set by the function are retained. Add comprehensive tests for the new attr merging behavior.
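For readers who want to see the described semantics concretely, here is a minimal pure-Python sketch of a `drop_conflicts`-style merge and of the per-variable attr handling this PR proposes. It is an illustration only, not the xarray implementation: `merge_drop_conflicts` and `map_result_attrs` are hypothetical names, attrs are modelled as plain dicts, and values are compared with plain `!=` (an assumption that would not hold for array-valued attrs).

```python
def merge_drop_conflicts(attr_dicts):
    """Keep keys whose values agree wherever they appear; drop conflicting keys.

    Hypothetical helper: a plain-dict model of the "drop_conflicts" strategy.
    """
    merged = {}
    dropped = set()
    for attrs in attr_dicts:
        for key, value in attrs.items():
            if key in dropped:
                continue
            if key not in merged:
                merged[key] = value
            elif merged[key] != value:
                # Conflict: remove the key and remember it so it cannot come back.
                del merged[key]
                dropped.add(key)
    return merged


def map_result_attrs(result_attrs, original_attrs, keep_attrs):
    """Attrs of one mapped variable under the behaviour proposed in this PR."""
    if keep_attrs:
        # Merge what the function returned with the original attrs.
        return merge_drop_conflicts([result_attrs, original_attrs])
    # keep_attrs=False: leave attrs exactly as the function returned them.
    return dict(result_attrs)


original = {"units": "degC", "long_name": "temperature"}
returned = {"units": "K", "note": "converted"}

print(map_result_attrs(returned, original, keep_attrs=True))
# {'note': 'converted', 'long_name': 'temperature'}
print(map_result_attrs(returned, original, keep_attrs=False))
# {'units': 'K', 'note': 'converted'}
```

With `keep_attrs=True` the conflicting `units` attr is dropped while the non-conflicting attrs from both sides survive; with `keep_attrs=False` the original attrs play no role.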

Weighted operations internally propagate attrs from weights through computations like dot(). When keep_attrs=False is passed, users expect no attrs on the result, but attrs from weights were leaking through. Clear attrs explicitly in _implementation when keep_attrs is False. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

The `test-nightly` environment uses pandas nightly wheels from PyPI, which currently don't have win-64 builds available. This causes `pixi lock` to fail when solving for all platforms. RTD builds fail because they have no lock file cache (unlike GitHub Actions CI which caches pixi.lock). When RTD runs `pixi install -e doc`, pixi must generate the lock file from scratch, which fails on the unsolvable test-nightly/win-64 combination. This restriction can be removed once pandas nightly provides win-64 wheels again. Co-authored-by: Claude <[email protected]>

dopplershift · 2025-12-18T00:41:16Z

I'll pipe in to say that this PR greatly reduces the number of test failures MetPy has with the latest xarray, so it's a 👍 from me for what that's worth.

keewis · 2025-12-18T11:23:35Z

xarray/core/dataset.py

        if keep_attrs:
+            # Merge attrs from function result and original, dropping conflicts
+            from xarray.structure.merge import merge_attrs
+
            for k, v in variables.items():
-                v._copy_attrs_from(self.data_vars[k])
+                v.attrs = merge_attrs(
+                    [v.attrs, self.data_vars[k].attrs], "drop_conflicts"
+                )
            for k, v in coords.items():
                if k in self.coords:
-                    v._copy_attrs_from(self.coords[k])
-        else:
-            for v in variables.values():
-                v.attrs = {}
-            for v in coords.values():
-                v.attrs = {}
+                    v.attrs = merge_attrs(
+                        [v.attrs, self.coords[k].attrs], "drop_conflicts"
+                    )
+        # When keep_attrs=False, leave attrs as the function returned them


The problem with interpreting keep_attrs=False as "leave the attrs as returned by the function" is that that means we don't have keep_attrs="drop" anymore.

I'd argue that keep_attrs=True should be closer to what you're proposing for keep_attrs=False, which I do think would be more intuitive.

So instead we may need to consider supporting keep_attrs with strategy names / a strategy function, like apply_ufunc does. That would still allow you to choose "drop_conflicts" if preferred (or maybe as the default? Not sure), while not changing behavior too drastically.

max-sixty · Dec 20, 2025

yes, the proposed code treats keep_attrs=False as "remove all the input attrs", but not "remove all the output attrs".

@keewis can you see a reasonable change to fix the immediate issue without adding a whole strategy to keep_attrs? I don't have a particularly strong view on this specific implementation, but it does seem reasonable / logical, and it does let us solve this immediate bug...

(zooming out: as I mentioned before, for me the best "blank-slate" implementation for keep_attrs is to mostly not have the option at all, and folks can drop attrs if they want. though I agree with you that merging is a case that neither approach handles well...)

dopplershift · 2026-01-16T00:02:46Z

Is there anything this is waiting on? I'm trying to decide if I need to vendor our own implementation of map() for MetPy or if there's hope that something like this PR will land.

max-sixty · 2026-01-16T01:23:10Z

@pydata/xarray any thoughts? reasons not to move forward?

jsignell · 2026-01-16T15:57:36Z

xarray/core/dataset.py

> the proposed code treats keep_attrs=False as "remove all the input attrs", but not "remove all the output attrs".

keep_attrs as a boolean is always ambiguous. For instance in #10997 I arrived at the conclusion that keep_attrs=False should drop all attrs from the output.

It seems like it wouldn't be too hard to make this more configurable by just passing any string that is passed to keep_attrs along to merge_attrs like @keewis is suggesting. You would just map True to "override" and False to "drop" (I think that's what the behavior is now on main) and then people who want something else can use a specific string.
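The mapping suggested above could be sketched roughly as follows. This is a hedged illustration, not code from the PR: `resolve_keep_attrs` is a hypothetical helper, the tuple of strategy names is assumed to match the `combine_attrs` options that `merge_attrs` accepts, and the `True` -> "override" / `False` -> "drop" mapping is taken from the comment above rather than verified against main.

```python
# Strategy names assumed to be accepted by merge_attrs (combine_attrs options).
VALID_STRATEGIES = ("drop", "identical", "no_conflicts", "drop_conflicts", "override")


def resolve_keep_attrs(keep_attrs):
    """Translate a keep_attrs argument into a merge strategy (hypothetical helper)."""
    if callable(keep_attrs):
        # A user-supplied strategy function is passed through unchanged.
        return keep_attrs
    if isinstance(keep_attrs, bool):
        # Booleans keep their (assumed) current meaning.
        return "override" if keep_attrs else "drop"
    if isinstance(keep_attrs, str) and keep_attrs in VALID_STRATEGIES:
        return keep_attrs
    raise ValueError(
        f"keep_attrs must be a bool, a callable, or one of {VALID_STRATEGIES}, "
        f"got {keep_attrs!r}"
    )


print(resolve_keep_attrs(True))              # override
print(resolve_keep_attrs(False))             # drop
print(resolve_keep_attrs("drop_conflicts"))  # drop_conflicts
```

Under this sketch, existing boolean callers would see no change, while callers who want the merging behaviour from this PR could opt in with `keep_attrs="drop_conflicts"`.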

Address PR feedback to also clear coordinate attrs (not just data_vars attrs) when keep_attrs=False in both DataArrayWeighted and DatasetWeighted. Added test to verify coord attrs are cleared for both DataArray and Dataset. Co-authored-by: Claude <[email protected]>

github-actions bot added the topic-DataTree label (Related to the implementation of a DataTree class) Dec 16, 2025

max-sixty and others added 4 commits December 15, 2025 22:19

Merge remote-tracking branch 'origin/rtd' into attrs

89010a8

Add whats-new entry for Dataset.map attrs behavior

f9a5669

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

max-sixty added the needs review label Dec 18, 2025

keewis reviewed Dec 20, 2025

dopplershift mentioned this pull request Dec 22, 2025

Incompatibility with xarray 2025.12.0 Unidata/MetPy#3978

Open

Merge branch 'main' into attrs

6645853

jsignell reviewed Jan 16, 2026

max-sixty and others added 3 commits January 16, 2026 11:36

Merge branch 'main' into attrs

05f42d7

Fix mypy type error in test_weighted_operations_drop_coord_attrs

ec5d6fc

Use proper type annotation with DataArray | Dataset union type to avoid incompatible assignment error. Co-authored-by: Claude <[email protected]>
